Group-Adaptive Adversarial Learning for Robust Fake News Detection Against Malicious Comments
Tong, Zhao, Gong, Chunlin, Gu, Yimeng, Shi, Haichao, Liu, Qiang, Wu, Shu, Zhang, Xiao-Yu
The spread of fake news online distorts public judgment and erodes trust in social media platforms. Although recent fake news detection (FND) models perform well in standard settings, they remain vulnerable to adversarial comments, authored by real users or by large language models (LLMs), that subtly shift model decisions. In view of this, we first present a comprehensive evaluation of comment attacks on existing fake news detectors and then introduce a group-adaptive adversarial training strategy to improve the robustness of FND models. Specifically, our approach comprises three steps: (1) dividing adversarial comments into three psychologically grounded categories: perceptual, cognitive, and societal; (2) generating diverse, category-specific attacks via LLMs to enhance adversarial training; and (3) applying a Dirichlet-based adaptive sampling mechanism (the InfoDirichlet Adjusting Mechanism) that dynamically adjusts the learning focus across comment categories during training. Experiments on benchmark datasets show that our method maintains strong detection accuracy while substantially increasing robustness to a wide range of adversarial comment perturbations.
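The abstract does not specify how the InfoDirichlet Adjusting Mechanism updates its sampling distribution, so the following is only a minimal sketch of the general idea: draw per-category sampling proportions from a Dirichlet distribution (built from Gamma draws) and nudge the concentration parameters toward categories with higher adversarial loss. The update rule `adjust_alpha` and the loss values are hypothetical, not taken from the paper.

```python
import random

random.seed(0)

CATEGORIES = ["perceptual", "cognitive", "societal"]

def sample_dirichlet(alpha):
    # Draw category proportions from Dirichlet(alpha) via normalized Gamma draws.
    draws = [random.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def adjust_alpha(alpha, losses, lr=0.5):
    # Hypothetical update: raise the concentration of categories whose
    # adversarial loss is above average, so they are sampled more often.
    mean_loss = sum(losses) / len(losses)
    return [max(0.1, a + lr * (l - mean_loss)) for a, l in zip(alpha, losses)]

alpha = [1.0, 1.0, 1.0]          # uniform prior over the three categories
losses = [0.9, 0.4, 0.5]         # toy per-category adversarial losses
alpha = adjust_alpha(alpha, losses)
props = sample_dirichlet(alpha)
print(dict(zip(CATEGORIES, props)))
```

In this toy run, the "perceptual" category has the highest loss, so its concentration parameter grows and it tends to receive a larger share of the adversarial training batch.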
Semantic and Contextual Modeling for Malicious Comment Detection with BERT-BiLSTM
Fang, Zhou, Zhang, Hanlu, He, Jacky, Qi, Zhen, Zheng, Hongye
This study aims to develop an efficient and accurate model for detecting malicious comments, addressing the increasingly severe issue of false and harmful content on social media platforms. We propose a deep learning model that combines BERT and BiLSTM. The BERT model, through pre-training, captures deep semantic features of text, while the BiLSTM network excels at processing sequential data and can further model the contextual dependencies of text. Experimental results on the Jigsaw Unintended Bias in Toxicity Classification dataset demonstrate that the BERT+BiLSTM model achieves superior performance in malicious comment detection tasks, with a precision of 0.94, recall of 0.93, and accuracy of 0.94. This surpasses other models, including standalone BERT, TextCNN, TextRNN, and traditional machine learning algorithms using TF-IDF features. These results confirm the superiority of the BERT+BiLSTM model in handling imbalanced data and capturing deep semantic features of malicious comments, providing an effective technical means for social media content moderation and a healthier online environment.
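For readers comparing the reported numbers, the metrics above follow the standard binary-classification definitions. The sketch below computes them from a confusion matrix; the counts are illustrative only and are not the paper's actual data.

```python
def precision_recall_accuracy(tp, fp, fn, tn):
    # Standard binary classification metrics.
    precision = tp / (tp + fp)               # of predicted-malicious, fraction correct
    recall = tp / (tp + fn)                  # of truly malicious, fraction found
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy

# Toy confusion counts chosen to roughly mirror the reported scores.
p, r, a = precision_recall_accuracy(tp=94, fp=6, fn=7, tn=93)
print(f"precision={p:.2f} recall={r:.2f} accuracy={a:.2f}")
```

On an imbalanced dataset like Jigsaw, accuracy alone can be misleading, which is why the study reports precision and recall alongside it.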
MALCOM: Generating Malicious Comments to Attack Neural Fake News Detection Models
Le, Thai, Wang, Suhang, Lee, Dongwon
Therefore, to mitigate such problems, researchers have developed state-of-the-art (SOTA) models to autodetect fake news on social media using sophisticated data science and machine learning techniques. In this work, then, we ask "what if adversaries attempt to attack such detection models?" and investigate related issues by (i) proposing a novel attack scenario against fake news detectors, in which adversaries can post malicious comments toward news articles to mislead SOTA fake news detectors, and (ii) developing Malcom, an end-to-end adversarial comment generation framework to achieve such an attack. Through a comprehensive evaluation, we demonstrate that about 94% and 93.5% of the time on average Malcom can successfully mislead five of the latest neural detection models to always output targeted real and fake news labels. Furthermore, Malcom can also fool black box fake news detectors to always output real news labels 90% of the time on average. We also compare our attack model with four baselines across two real-world datasets, not only on attack performance but also on generated quality, coherency, transferability, and robustness. We release the source code of Malcom at https://github.com/lethaiq/MALCOM
[Figure example: Real Comment: "admitting im not going to read this (...)"; Malcom: "hes a conservative from a few months ago"; Prediction Change: Real News -> Fake News]
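The attack scenario above can be illustrated with a deliberately tiny sketch: a toy bag-of-words "detector" classifies an article (with its comment thread) by the fraction of suspicious words, and the attacker greedily tries candidate comments until one flips the label. This stand-in detector and the `attack` helper are invented for illustration; Malcom itself generates comments with a trained neural model rather than selecting from a fixed list.

```python
# Toy fraction-based detector: an article is "fake" when suspicious words
# make up a large enough share of the text (threshold picked arbitrarily).
SUSPICIOUS = {"shocking", "hoax", "exposed"}

def detect_fake(text):
    words = text.lower().split()
    frac = sum(w in SUSPICIOUS for w in words) / len(words)
    return "fake" if frac >= 0.4 else "real"

def attack(article, comments, target="real"):
    # Greedy search over candidate comments (a crude stand-in for
    # Malcom's learned comment generator).
    for c in comments:
        if detect_fake(article + " " + c) == target:
            return c
    return None

article = "shocking hoax exposed by insiders"
candidates = ["totally credible report", "this matches official sources"]
chosen = attack(article, candidates)
print(detect_fake(article), "->", detect_fake(article + " " + chosen))
```

Appending benign-looking text dilutes the suspicious-word fraction and flips the toy detector's decision, mirroring (in miniature) how posted comments can steer a comment-aware fake news detector toward a targeted label.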
UBIC and Hearts United Group Partner to Launch New AI-Based Service to Identify Potential Risks from Online Data - NASDAQ.com
TOKYO, May 16, 2016 (GLOBE NEWSWIRE) -- UBIC, Inc. (Nasdaq:UBIC) (TSE:2158), a leading provider of international litigation support and big-data analysis services, and Hearts United Group Co., Ltd. announced today that on June 1, they will launch DH-AI, a next-generation system designed to detect potential signs of risk contained in comments and other content posted on the Internet using UBIC's KIBIT artificial intelligence (AI) engine. Since UBIC started engaging in joint research with Hearts United Group in October 2015, both companies set out to develop cutting-edge debugging technologies and services using AI. By leveraging their technical expertise, the companies have made steady progress researching AI-based debugging and are now preparing the service for commercialization. In recent years, many firms have launched community and blog websites as a channel to communicate with end users, so as to promote their products and services. Increasingly, malicious comments have been posted on such websites, which often serve to incite hostile exchanges or mislead customers about products and services, resulting in damage to the companies' public images.